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An Exact Analysis of the Multistage Model Explaining 
Dose-Response Concavity 


Louis Anthony Cox, Jr.' 


Dett^mder 'J. 



The Jraditionai rrsulti^ase (MS) mode! of carcinogenesis .mplies several empirically testable prop¬ 
erties for dose-response fiinctjons. These include convex (linear or upward-curving) cumulative 
hazards as a function of dose; syrnmetrie erTects on lifetime rumor probability of transition rates 
at dirTercnt stages: cumulative hazard functions that increase without bound as stage-specific tran¬ 
sition rales increase without bound: and identical tumor probabilities for individuals with identical 
parameters and exposures. However, for at !ea.st some chemicals, cumuiative hazards are not con¬ 
vex functions ol dose. This paper shows that none of these predictec properties is implied by the 
mechanistic assumptions of the MS model itself. Instead, they anse from the simplifying Tare- 
rumor" approximations made in the usual mathematical analysis of the model. .4n alternative e.xact 
probabilistic analysis of the MS model wnh only two stages is presented, both for the usual case 
where a carcinogen acts on both stages simultaneously, and also for idealized iniliation-promotion 
experiments in which one stage at a time is affected. The exact two-stage mode! successfully hts 
o.oassay data for chemicals (e g., I .p-butadienei with concave cumulative hazard functions i.hat 
are not well-descnbed oy i.ae traditional .MS model. Oualitaitve properties of the exact rwo-siace 
inode, are describee and illustrated by least-squares fits to several real datasets. The major contn- 
pution IS to show- that properties of the traditional M5 model family that appear to be inconsistent 
with empincal data tor some chemicals can be e.xplained easily if an exact, rather than an approx¬ 
imate model, is used. This suggests that it may be worth using the exact model in cases where 
tumor rates are not negligible (e.g,, in which they exceed 10%). This includes the majonty of 
biOaSsay experiments currently being performed. 


KEY WORDS: Cancer dcse-i 
functions, carcinogenesis, 


1. tNTRODUCTION 

Upper confidence limits on unit nsk estimates pro¬ 
duced by the lineanzed multistage model are not in¬ 
tended to reflect all. or even the most important, sources 
of unceitainty, For example, they do not address the fol¬ 
lowing uncertainties, which may well dominate statisti¬ 
cal sampling errors; 

.Vfoefe/ uncenainty (is the multistage model 
appropriate for a given dataset?) 

' Co.x Associates. 503 FnuikfiR Street. Den'cr, Co'orado. 80518 


modeling. .MVK. model, multistage model, Two-stage model; hazard 


Inierindividuat parameter variabiiicv, resulting 
from the fact that pharmacokinetic and other parameters 
may be different for different individuals. 

Inienndividual srockasiic variability, arising from 
the fact that even identical individuals exposed to iden¬ 
tical dose histories will develop randomly-sized initiated 
cell populations, and thus will have randomly different 
nsks from carcinogens that act on initiated cells. 

This paper presents a new analysis of the two-stage 
model that addresses aspects of model uncertainty and 
intenndividuaJ stochastic variability. A useful modem sta¬ 
tistical treatment of interindividual parameter vanabilitv. 
with techniques for estimating both the population disen- 
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bution of a parametei vector that affects individual nsk 
and the conditionaJ individual nsk given the vaiue of the 
parameter vector, is presented by Mentre and Mallet.'" 

Three specific problems that challenge the useful¬ 
ness of the traditional multistage (MS) model are the 
following: 

(i) It does not appear to provide a useful fit to some 
important cancer dose-response datasets. For example, 
it cannot explain or describe cumulative hazard functions 
that are Concave functions of dose (see Fig. 1 and the 
discussion of it in Section 2). 

(iij It Ignores variability’ in the papulation distri¬ 
bution of individual risks. 

/Hi) It makes qualitatively incorrect mechanistic pre¬ 
dictions at high doses. For example, the MS model treats 
the transition rates (of cells from one stage to the next) 
for different stages symmerricaily (see Appendix!. If all 
stage-specific transition rales are positive, then the MS 
model allows the cumulative rumor hazard to become ar- 
bitrarily large as the transition rale for any stage becomes 
arbitrarily large. Thus, the predicted rumor probability ap¬ 
proaches I as any transition niie approaches infinity. 
However, the physical basis of the MS model implies a 
different conclusion; if only the transition rates m late 
stages are large, then the smaller transition rates at the 
earlier stages become rate-limiting. The cumulative tumor 
hazard and lifetime tumor probability then become insen¬ 
sitive to the iate-stage. large transition rates. 

This paper shows that all of these objections to the 
traditional MS model are attnbutable not to the model 
itself, but to the simplifying approximations usually 
made in analyzing it. The model can be analyzed without 
approximations, leading to an exact analytic solution dif¬ 
ferent from the usual (polynomial cumulative hazard) 
expression assumed in most multistage modeling soft¬ 
ware.'-' In a separate paper, we present a simple den- 
vation and expression of the exact solution for any 
number of stages in vector-matrix notation. This paper 
denves and interprets the e.xact solution for the special 
case of two stages, ■which suffice for many dose-re¬ 
sponse datasets. The exact solution corrects each of the 
preceding three problems, showing that the physical as¬ 
sumptions of the MS model are compatible with a va¬ 
riety of datasets le.g., those with concave cumulative 
hazard functions) even when the approximate mathe¬ 
matical model is not, 

2. CONCAVE CUML'L.ATIVE HAZARD 
FUNCTIONS 

A testable prediction of the MS model is that the 
cumulative hazard function M(.ti should be a convex (up¬ 


ward curving or linear) function of the dose, .t. Specif¬ 
ically, the traditional MS model has the form 

P{x) = 1 - exp(-(g„ * -t- q.x")] 

or. equivalently. 

i7(,r) = -ln[l - A-v)) = i?a -r + ... - 

where x ~ dose. Ptx/ = lifetime probability of tumor at 
dose .t. /f(.t) = —ln[! - Ptx)] = cumulative tumor haz¬ 
ard at do.se x. and </, = parameters of the dose-response 
relation (to be estimated from data), It is assumed that 
the parameters satisfy the constraints 

q ,>0 

This implies that //(x) must be a convex (upward-curv¬ 
ing or linear) function of .c, since each of the functions 
r* is convex function of .r, 

■A sirnple test for determining whether a given da¬ 
taset IS consistent with the multistage family of models 
IS therefore as follows: If the .-WS model holds, then, 
when Hix) is plotted against x. the resulting curve 
should be convex. Schell and Leysieffer (1989. p. 1120) 
provide a formal statistical test for the hypothesis of con¬ 
cavity of dose-response curves. A significantly concave 
Hix) curve would show that the algebraic form of the 
multistage model is inappropriate (misspecified) for the 
dataset, making estimates of its parameter values irrel¬ 
evant. Unless the concavity of H{x] can be e.xplamed 
away by a concave relation between administered dose 
and biologically effective dose — a real possibility when 
metabolic saturation occurs, as demonstrated by Travis 
et at '^' for tetrachloroethylene. but not plausible for non- 
sarurating dose levels — then a significantly concave 
Hix] curve must be interpreted as evidence that the MS 
model does not adequately describe the data. Other mod¬ 
els must then be sought to desenbe the dose-response 
relation. When a concave dose-response relation is ob¬ 
served, doubt that the MS model is applicable may un¬ 
dermine confidence in us predictions and conclusions— 
including its conclusions about upper confidence limits, 
w hich are predicated on llie correctness of the model. 

Enough data points are available for some chemi¬ 
cals at doses well below metabolic saturation levels to 
make a useful plot of x vs. H{x). One such chemical is 
1,3-butadiene, Melnick et al."" (Table 16.5) reported the 
data points shown in Table I for the dose-responses of 
male B6C3F1 mice exposed to butadiene for 2 years. 
The second (bottom) number in each pair gives the frac¬ 
tion of mice that developed tumors within 2 years. 
[.Above 200 ppm, an ecotropic retrovirus or other mech¬ 
anism IS activated that dramatically increases the inci¬ 
dence of lymphoid leukemias, leading to P = 0.86 at ,t 
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Tabic L Dose-[Response Data for Maje B6C3F1, Mice Exposed to 
Low and Modeme Concentrations of Butadiene for 2 Vears" 

.r (ppm) 0 6.25 20 62.5 200 

Pix) 0.35 0 55 0.62 0 68 _ 

j Sourca: MelnicV: ei a/."*’ 



IDO na 


EXPOSURE CONCENT1UT30N !>• PPM. x 


± Prob«eiUiy maligiuac tiunar sniliiji 2 >«.an 

. = Comui^ti^e Qirare for aiaJt^AQi nimor sriiam 2 vun 

CumuiaijTe oaurd s -lall * probatfUilv) 

Fig. 1. \ concave cumulative hazard dose'-response curve for t.3- 
I buudiene. 

I = 625 (Melnick et a!.)}'' Therefore, and also to avoid 

\ complications due to metabolic saturation, only the 

■ dose-response curve at concentrations at or below 200 

I ppm is considered here.] Figure L plots H(x) against x 

' for the data in Table 1. The resulting curve is noticeably 

concave The exposure concentrations are all well below 
! the lOOO-ppm level at which metabolic saturation is ex- 

1 pected to become noticeable;'' so a simple explanation 

of concavity by saturation of metabolism is not self-ev¬ 
ident. Finding a “best-fitting” .MS model for these data 
would be misguided, since the empirical dose-response 
curve appears to fall significantly outside the MS family. 

If 1,3-butadiene were the only chemical exhibiting 
a significantly concave H{x) function, it might be re¬ 
garded as a curiosity, not wonh modifying the MS 
model for However, a recently completed study of iso- 
prene (Battelle Columbus Labs, in press) shows a 
strongly concave H{x) curve over a range of x values 
that includes 0. 10. 70, 140, 280. 700, and 2200 ppm 
(see Fig. 4), .4s shown in Fig. 4, some well-known 
\ chemicals are better desenbed by concave rather than 

' convex cumulative hazard functions when a family of 

1 models that includes both is fit to e.xperimental data. As 


previously mentioned, Travis et al..'-' as well as other 
authors, have found concave dose-response relations, al¬ 
though some of these may be explained by metabolic 
saturation (unlike the 1 .j-butadiene and isoprene cases(. 
Thus, empincal evidence suggests that realistic dose- 
response models must be flexible enough to allow for 
concave cumulative hazard functions. To this end. it is 
necessary to modify the traditional MS formula. 


3. EXACT ANALYSIS OF A TWO-STaGE 
INITIATION-PROGRESSION MODEL 


3.1. Qualitative Implications of the .Multistage 
Model in an Initiation-Progression Setting 

Before presenting an exact analysis of the rwo-stage 
MS model, it may be useful to illustrate and explain the 
mam points in a simpler setting. This section considers 
an idealized initiation-progression expenment in which 
the roles and interactions of the initiation and progres¬ 
sion stages can be .studied separately. (The MS frame¬ 
work does not consider cell proliferation, so we discuss 
initiation-progression rather than iniliation-promotion.l 
.Animals are exposed to a pure initiator for L penods. 
followed by exposure to a pure progressor for T penods, 
after which terminal sacrifice occurs. .A.s in the MS 
model, the initiator randomly mutates normal stem ceils 
to create initiated cells. The progressor randomly trans¬ 
forms initiated cells into malignant ones. 

If there are imtially N normal stem cells and tf the 
initiation rate is p, mutations per normal cell per urn 
time during the first L penods and the progression rate 
is fi, mutations per initiated cell per unit time during the 
last T penods, then what is the probability that at least 
one malignant cell is formed prior to lentiinai sacnfice? 
The answer, according to the reasoning of the traditional 
MS rnodel (reviewed m the Appendix) is as follows; 

P = 1 - exp(—Pifx-iT/k') 

(Armitage-Doll and MS formula) 

This formula, well known as the Armitage-Doll formula 
for two stages, becomes the MS model when the linear 
(i, are substituted into it. It implies the following con¬ 
clusions; 

(i) The cumulative tumor hazard. H = |j,)iJ.77V. is 
a convex (linear or upward-curving) function of the dose 
when the transformation rates are positive linear func¬ 
tions of dose, i.e., when 

= a, A b,x for j = ), 2. with a, > 0 and 6, > 0 
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Figure I shows that this conclusion may be empirically 
false. 

(ii) A.1! individuals with identical parameters (i.e., 
with identical values for p,. p,, L T. and iV) have the 
same response probability, P. 

(iii) ji, and |ij affect risk symmetricaily, 

fiv) If pii.Vr > 0. then the P approaches 1 as p, 
(and hence H] approach infinity. 

A closer look at the model shows that all four of 
these qualitative implicaiiotis are false. They result from 
simplifying approximations (the “rare-tumor" approxi¬ 
mation discussed in the Appendix) made in the mathe¬ 
matical analysis of the model, rather than being logically 
implied by the model's assumptions. 

A slight modification of the model makes it more 
realistic and helps to simplify its exact mathematical 
analysis. Henceforth, it will be assumed that the number 
of normal stem cells remains appro.ximately constant 
throughout the experiment, e.g.. due to homeostatic reg¬ 
ulation of the normal stem cell compartment. (In the 
Armitage-Doli or MS model, by contrast, normal stem 
ceils that become initiated are not replaced. Of course, 
if initiations are rare and .V is large, the numencal dif¬ 
ference in results made by assuming that .V is depleted 
by initiation instead of assuming that .V is conserved is 
negligible.) Norie of the conceptual points that follows 
depends on this modification (see the .Appendix), but it 
makes the equations developed below more intuitive and 
easier to interpret. 


3.2. \1athematica) .Analy sis of the Model 

The total number of initiated cells, denoted by f, 
formed by random mutations dunng the period of length 
L has a Poisson distnbunon with mean (and variance) 
given by 

£•(/) = Vart,/) ^ p,Z.;V 

= expected number of initiated cells formed. 

The random number of malignant cells formed, denoted 
by ;V/, is also a Poisson random variable, with mean and 
variance of 

£(/Vf) = Var(W) = £(/){! - expi-p;?!] 

= MiLAIl - exp(-p-r)] 

(This follows from the fact that a binomially-sampled 
Poisson random variable is also Poisson with a mean 
equal to the mean of the onginal variable multiplied by 
the sampling probability; see e.g., Ross."" Since each 
initiated cell becomes malignant with the same proba¬ 


bility'. namely. 1 - exp(-,u,71. binomial sampling ap¬ 
plies.) 

Now, consider the tumor probability as p, —> x 
The probability that at least one malignant ceil is formed 
becomes the same as the probability' that / > 0. The 
Poisson distribution of I ir-plies that this probability is 

lim P = Pr>.\ > Ol = I - cxp[-f(ni = 1 - ix,p(-p (.At 

a, = 

Thus, the limiting value of P may lie anywhere between 
0 and 1, depending on the value of PiiA. even if p.T 
> 0. in contrast to the implications of the Airnitage-Doll 
formula. 

The exact probability that M = 0 also follows di¬ 
rectly from its Poisson distribution: 

PriX-f = 0) = exp[-£(/ff)] 

= expi-(p|£,V)[l - cxp(-p,nj} 

The total probability that at least one malignant cell is 
formed (interpreted as the “tumor probability'") is there¬ 
fore 

E{P) = [I - Pr(M = 0)) 

= I - exp{-(n,£V)[l - exp(-p,n]! 
(E.xact formula for 2-stage model) 

It is denoted by EiP) to emphasize that it is the uncon¬ 
ditional probability of mmor; it could also have been 
derived with more effort from the law of total probabil¬ 
ity, as 

E[P) = E[E{P\I')] = 1 ,5 least 1 malignant 

ccltl/ initiated cells) *Pr{! initiated cells)] 

Expressed in term of the cumulative hazard, the tu¬ 
mor probability is 

E{P) = 1 - exp(-//) 

where H. the cumulative hazard, is given by 

H = {p.,,EA^)[t - exp(-|a.r)l = £(/)[] - expC-p.r)] 

H has a direct physical interpretation as lire expected 
number of malignant cells formed. This formula for H 
shows that it does not approach infinity as (i-F does, in 
contrast to the MS rnodei. Instead, it approaches £(/), 
explaining why the probability of tumor approaches a 
value less than 1. Also, as in the binomial case analyzed 
in the appendix. H(x) need not be a convex function of 
X when 

H, = a, -b b,x for i = 1,2, with a, 2 0 and b, > 0 
Figure 2 iilustrates several possible shapes for H{x). 
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Fig. 2. E.i(3/npie dose-response curves for the exaci rwo-stnee initiation-progression model. 



Fig. j. The exact and approximate iwo-siage dose-response models may differ sigrnhcanily at high doses. 


From the preceding expression for E[P). and recal¬ 
ling that 1 - exp(--T) approaches x as .r approaches 0, 
it IS clear that the value of E{P) approaches the MS 
model’s value for P in the “rare-nimor limit" as p,I 
and p-T approach zero. However, this limit is inappro- 
pnate in many bioassay situations. Tumor incidence 
rates commonly exceed 20%. even m the absence of 
exposure (see Table [). Applying the MS model to an¬ 
alyze such data may yield misleading results. Figure 3 
illustrates the potential discrepancy between exact and 
approximate model curves. 

Finally, consider whether individuals with identical 
parameters and doses have identical risks. As just 
shown, the tumor probability P follows a nondegenerate 
frequency distribution in the population, i.e.. it is not a 


constant [equal to E{P)] for all members of the popu¬ 
lation. Specifically, for n individual with / initiated 
cells, the (conditional) tumor probability is 

1 ^ = I - exp{-/ [1 - exp(~p,D]} 

For an individual with an unknown value of [. i.e.. for 
whom / is a Poisson random variable with mean E(D, 
the (unconditional) probability of tumor is just E{P), 
since individual risk is the mean of the population dis¬ 
tribution of risks when only the distribution, but not the 
actual value, of f is known. [This follows from the con¬ 
ditional expectation identity E(P) = £IE(P j /)] (Ref. 6, 
p. 15, Ref. 7, p. 62). C)n the other hand, to a decision¬ 
maker concerned with the population distribution of 
nslcs. it may be important that some individuals (those 
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with high values of f) will be especially at nsk of de¬ 
veloping tumors if exposed to a carcinogen that in¬ 
creases jj,. From this perspective, identically exposed 
individuals do not have identical (constant) risks, but 
rather idenricaUy dtatributed random nsks. This disiinc- 
lion is also potentially useful in analyzing mmor count 
data from high-dose animal expenments, where there 
may be multiple tumors per animal. Among k identical 
exposed animals, the expected number of malipant cells 
created during an initiation-progression expenment, and 
also the vanance of this number, are given by 

k£iM] = A-Var(.t/) = i-( 4 ,Z.V)[l — expl-p,?")] 

To summarize this section, the Armitage-Doll formula 
for rumor probability (or the VIS model, if the transition 
rates are specified to be linear functions of dose) is 

P = \ ~ exp[-^,r‘ £{!)] = I - exp(-u,p,zr.V) 

(Annitage-Doil/MS approximation) 

The exact formula for the tumor probability is 

P - I - exp( -u-r'"/) 

iConditional rumor probability) 

with expected value 

EtP) = I - e.xpl-[|uT,.\][l - expl-u-D]! 

'Exact unconditional rumor probabiiirv) 

For anv specified values of u.Z.Vand pT. the exact for¬ 
mula gives a .smaller risk than the approximate one. The 
numencai difference between them approaches zero as 
u,Z.\ and pTboth approach zero, but may be large for 
the range of u.Z.V and u.T values that are relevant to 
bioassay data analysis. For example. Fig, 3 shows the 
dose-response curves, i.e.. ihe plots of Pt.v) vs. x. from 
ihe exact and approximate models when 

p.i.V = 0.1 and pi,r = O.l.r 

as v ranges from 0 to 200. .Although the curves approach 
each other and essentially coincide for sufficiently small 
doses and mmor probabilities, they differ by more than 
a Factor of 2 for tumor probabilities of 20% and more. 


4. REGRESSION EQUATIONS FOR THE EXACT 
TWO-STAGE MODEL 

The MS model is usually applied to situations in 
which a single chemical carcinogen is thought to act on 
different stages simultaneously, rather than having dif¬ 
ferent chemicals act sequentially on different stages as 
in ihe idealized mitiaiion-progression model. Instead of 
two distinct exposure penods there is only one. whose 


length will be denoted by T. This .seciion derives the 
exact tumor probability |i.e.. the probability that at least 
one malignant cell is created by time D for such cases 
and shows that the main conclusions from the Initiation- 
Dtogression model extend to this setting. 

The probability that at least one malignant cell is 
formed by time T may be d- rived as follows. 

(i) The number of initiated cells created from time 
f = 0 to time t = T. denoted by /, is a Poisson random 
vanable with mean 4,.VT. 

(ill Conditioned on the value of /, the arrival times 
of the initiated cells arc independently and uniformly 
distributed over the interval from 0 to T (Ross. 1983. p. 
37). The probability that an initiated cell created at time 
/ < T has no! undergone a second (malignant) transition 
by lime T is expf-udr - n]. Integrating over ail times 
! in (0. T) with respect to the uniform density 1 T shows 
that the total probability that an initiated cell created 
between 0 and T survives until time T without becoming 
malignant is given by the following quantity: 

-V = [1 - expi-it.nj.'ti.r 

= Prtinitiated cell does not become malignant) 

(hi) The probability that none of the / initiated cells be¬ 
comes malignant Is s'- Therefore, the conditional rumor 
probability (that at least one cell becomes malignani) is 

P = 1 - .V' 

liv) The unconditional mmor probability (averased over 
the conditional probabilities for all values of / weighted 
by theif Poisson probabilities) is 

E{P) = 1 - E(j') = 1 - exp[-tl - s)a,.V] 

(Exact two-stage model formula) 

where the expectation follows directly from the moment 
generating function for a Poisson random vanable. .As 
in the initiation-progression model, this formula has a 
clear physical inierpretation as 

individual tumor probability = 1 
- exp[-(expected number of malignant cells formed)] 

This completes the derivation of the rumor probability. 

To apply these formulas to risk analysis, it is nec¬ 
essary to include dose explicitly and to develop methods 
for estimating the unknown quantities from data. Dose 
may be included as follows. Define the dose-dependent 
cumulative hazard at time T as 

H{x') = (1 - s)iJ.,(w.)AT 

The formula for s and algebraic rearrangement yield 
W(.r) = [)i,(jr)A'71[)l:(.r)7'- 1 ^ exp<-p,(x)r)]/)U;(.r)7' 
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= ,V[iJ.|(jf)/p,(jr)]{P;(.r)r - I -r exp[-p,(,cir]; 

= * b^xVia. r + b.x)T ~ I - 

exp[—(a, 1 - 

Now, recalt that H{x) = -ln[l - /‘(x)J, where Plx] de¬ 
notes the tumor probability at dose.v. Equating these rwo 
formulas for H(x) gives the regression model 

H(x) = /V{(a, + b|,v)/(a, + b,.t)] 

{(a. -X b.x)r - 1 -I- exp[-(tj; + 

As in the initiation-progression model. H{.r) may be ei¬ 
ther convex or concave, depending on the (nonnegative) 
values of its parameters, In this regression model, the 
quantities x and P{x) (and hence -ln[l - are 

observed, although the observed value of P{x), namely 

P'-ix) = n(.r)/.V(,r) 

where n[x) = number of responding animals exposed to 
X and N\x) = total number of animals exposed to c. 
contains binomial sampling error. From the data points 
F‘{x) for several dose groups, .r, it is desired to estimate 
the parameters of the regression mode). 

it may appear that there are six unknown constants 
to be estimated, namely (,V, a,, 6,. a,, b,. T). Hoivever. 
,V can be absorbed into the constants a, and h,. since it 
muliipiies them. Similarly, any fi.xed T can be absorbed 
into a_, and 6. (and into a, and h- in the first term, on 
iTiultipiying by T V). These simplifications reduce the 
quantal regression model to the follow-ing four-parame¬ 
ter family: 

H{x) = [(a, -h b■,x\l^a. - - b~x) - ! 

* exp[~ia; H- b.,T)]j (Regression model) 

The four parameters of this "reduced model" are scaled 
\'ersions of the corresponding original parameters. They 
can be estimated by any of several nonlinear regression 
techniques.'*' e.g.. using a maximum-likelihood or a least 
squares criterion. 

A simple least-squares procedure is as follows. 
Given a data set of [x. P"(.x)] pairs, transform the ob¬ 
served P'\x) values to obtain the corresponding values 
for 

H*{x) = -Int! - P*{x)] 

Define the sum of squared errors for parameter vector jj 
= {a,, b,. a,, b.) and dataset D as 

SS{P!D) = I,[iV*(.x) - H(x-. P)!= 

where H(x', (3) represents the value for H(x) predicted by 
the regression equation when parameter vector p is as¬ 
sumed, Use a nonlinear optimization software package 
(several are available in commercial statistical packages 


and in engineering software such as M.ATL.AB) to find 
a value of 3 that minimizes 5'i'(3 | D). The result is an 
ordinary least squares (OLS) estimate of 0. 

More sophisticated procedures may also be imple¬ 
mented quite easily using current commercial software. 
For example, a weighted least squares (WLS) algorithm 
would modify the S5(3 I D) cntenon by weighting its 
individual terms, e.g„ according to the reciprocals of 
their sample variances based on the numbers of animals 
in each group. Maximum likelihood estimation (.MLE) 
would compute the likelihood of data set D for different 
values of the parameter vector 3 and choose a value of 
3 to maximize it. Maximum a posteriori (.M.APi Bave- 
sian estirnation would condition a pnor probability' den¬ 
sity function for 0 on the observed data and select the 
value of 3 corresponding to the mods of the resulting a 
posteriori density function, assuming that it is uniquely 
identifiable from the data. In many applications, these 
different procedures lead to slightly different estimates 
of 3 but to very similar dose-response curves. (They 
also lead to different procedures for constructing confi¬ 
dence regions for 0).'”’ This paper will use the OLS pro¬ 
cedure, as it IS simple to understand and implement. 
MAP approach to confidence region construction is de¬ 
veloped in a separate paper. 

Before turning to analyses of .some e.xample bio¬ 
assay datasets, it is worth noting the following asymp¬ 
totic properties of H[x). 

1. As .r — > 0. H(x) —r {ay'a,)[a. - 1 - exp(-u.j]. 
assuming that a, > 0. 

2. As a, —> with all other terms remaining 

bounded, H(0) -* a,. 

3. As .r H{x) —?■ b,x. 

Thus, if only the no-dose (x = 0) and several high-dose 
groups are examined, it will be impossible to estimate 
i); [since it does not affect W(0) or H{x:)] or to uniquely 
estimate a, and a, [since several different combinations 
of their values give the same value of //(O)], These dif¬ 
ficulties are referred to as ill conditioning and noniden- 
tifiability of parameters, respectively (Ref. 8, pp. 102- 
126). Even when intermediate values of H(x) are avail¬ 
able. the model may be modestly ill-conditioned, in chat 
different choices of 0 approximately minimize 5S(p | D). 
However, to the extent that all 0 values that come close 
to minimizing 55(|3 | D) give nearly identical bf{x: 0) 
functions over the full range of x values, the lack of a 
uniquely identifiable "best” parameter estimate is irrel¬ 
evant to risk assessment. 

The example curves in Fig. 2 illustrate several of 
these points. (The same qualitative properties apply 
equally well to the solutions of the initiation-progression 
model and the two-stage model with simultaneous ef- 
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Fig, 4. Shapes oi' siage gipdei dose-resppnsp curves for several 
example data sets. 



fects on both transition rates; see Fig. 4.) Curve 1, 
generated by the parameter veetor ^ = (0.5, 0.01, 10, 
1). IS a nearly straight line with an intercept of approx¬ 
imately 0.5. From data points falling along this line, it 
would be impossible to determine the value of a,, since 
changing its value from 10 to 30, for example, would 
produce no discernible difference in //(.r) or in the dose- 
response function (it is not rate-limiting). This is an ex¬ 
ample of ill-conditioning (many different parameter 
vectors produce almost the same W(.r) curs'e), but it is 
innocuous, since the risk at any given dose is insensitive 
to the value of o,. 

Curve 2, corresponding to the parameter vector (3 
- (0.5. 0.01. 0. 1). illustrates a greater difficulty. It es¬ 
sentially coincides with curve 1 at all doses above 10, 


Table li. Fit of ihc E-ract Tno-S(age Model Re«res.sian Model to 
Bioassay Datasets for Five Chemical Carcmoeetis 


Chemical 

Vanojice explained 
by regression 

1,3-Sutadiene 

9^ 9% 

Chloroform 

98 1% 

Fonnaldehyde 

RV J«, 

ls(5prene 


3(aiP 



but dives steeply toward zero at low doses. Although an 
observed data point at .r = 0 (the control group) will 
discriminate between these two curves, manipulating the 
value of A. controls the location of the “knee" at which 
the high-dose slope (controlled by h,) starts curving 
down toward the ongin. A dose-respon.se experiment 
that collects data only from exposure levels that bracket 
the "knee" (i.e„ from the control group and from ex¬ 
posure groups with .r values corresponding to the ap¬ 
proximately linear segment) will make it impossible to 
estimate ft.. leaving the location of the "knee" and of 
virtually safe doses uncenain. 

Curves 3 and 4. corresponding to [3 = (0, 0.02. 0. 

0.01) and fl = {0. 0 02. 0. 1). respectively, illustrate 
convex and nearly linear behaviors of the cumulative 
hazard function. 


5. EX.4.\1PLES analyses for several 

DATASETS 


The exact two-stage model developed in the pre¬ 
vious section has been fit to experimental bioassay data 
for several chemicals using the OLS technique. Figure 
4 shows representative results for five chemicals. 1.3- 
butadiene. chloroform, formaldehyde, isoprene. and 
8(a)P. The formaldehyde, 8(a)P, and chloroform daia- 
.sets were taken from EPA's IRIS database, while the 
i,3-buiadiene and isoprene data were taken from the 
sources already cited. (Details of the datasets, including 
the dose metrics and the number of animals in each dose 
group, as well as the estimated parameter values, are 
available from the author or from these sources. Here, 
we only want to indicate the applicability of the exact 
two-stage model to some real datasets, rather than dis¬ 
cussing these particular examples ip depth.) Usmg the 
percentage of variance explained by the nonlinear re¬ 
gression model as a enterion, the fit of the model to the 
five datasets is summarized in Table II. 
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The two-stage model explains a substantia! propor¬ 
tion of the variance in response frequencies in each case, 
as would be expected simply from the fact that dose is 
positively associated with response m both the model 
and the datasets. However, the quality of fit differs 
sharply among chemicals. It is worst for Bi'ajP. This 
might be expected a priori on biological grounds, since 
B/ajP is a potent promoter and the MS model (including 
the two-stage special case) ignores promotion based on 
cell proliferation. It is somewhat reassuring that the ex¬ 
act model is not flexible enough to provide a good fit to 
data that violate its mechanistic assumptions. 

In summary, the exact two-stage mode! appears to 
provide useful fits to the bioassay data for the chemicals 
m Table 11 other than B<a)P. Since it is based on specific 
mechanistic hypotheses (and since it does not make sim¬ 
plifying mathematical approximations that would allow 
u to fit datasets that are inconsistent with its assump¬ 
tions), it IS to be expected that the model will be inap¬ 
propriate for some chemical carcinogens—especially, 
for those that act solely or primarily by expanding the 
net growlh rate of the initiated population. For such car¬ 
cinogens. the modeling approach illustrated in this paper 
must be extended to obtain exact formulas for situations 
where dose-dependent birth-death parameters are im¬ 
portant. 
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.APPENDIX: DERIVATIONS OF EQUATIONS 

The Traditional Multistage Model with Two Stages 

[f there are initially H normal stem cells and if the 
initiation rate is p, mutations per normal cell per unit 


time dunng the first L periods and the progression rate 
is p, mutations per initiated cell per unit time dunng the 
last T penods, then the probability that a normal stem 
cell survives until terminal sacrifice without becoming 
malignant is 

F'r(celi IS never initiated) -b Prfceli ts initiated but not 
transformed I 

= exp(-p,T) -b [1 - exp(-p,Z.)](exp(-p.r)l 

The probability that a normal stem cell does become 
malignant is therefore 

p = Pr(a specific cel! becomes malignant I 
= 1 - exp(-p,Z,) - [I - exp(-p,L)] 
[expl-p-D] = [I - exp(-p,/.)![! - exp(-p,nj 

If PiZ. and p-T are sufficiently small Uhe rare-rumor ap¬ 
proximation), then this expression is well approximated 
numencally as 

P = = PiP-ir 

The probability that at least one of the iV cells becomes 
malignant, which is given exactly by the binomial prob¬ 
ability 

P = 1 - (1 - pY 

is numerically well approximated, for sufficiently small 
p (and therefore for sufficiently small p,/, and p.T. by 
the formula 

P = 1 - exp(-.Vp) = 1 - exp(-p,p^77Vl 
If it is assumed that 

p, = a, + b,x 

where x is the administered dose, then 

P = 1 — exp[-(a, -b b,.T)(o, b,x)LTN] 

= 1 — exp[—(ijo -i- q,x -b (MS Model) 

where = a^a^LTN, g, = ( 0 ,^, -f- a,b,)LTN, and q, = 
b-^b-LTN. Since the a, and 6, are nonnegative, the q, will 
be nonnegative, too. 

Exact Analysis of the Two-Stage Initiation- 
Promotion Mode) 

The random number of initiated cells formed during 
the initiation penod, denoted by /, has a binomial dis- 
tnbution with parameters N and 1 - exp(-p|Z.). The 
conditiona! probability that none of them becomes ma¬ 
lignant ts 

Pr(no malignant cells formed j / initiated cells) - 
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[exp(-[a,,71J'' = [exp(-|a,r/)] 

which has the unconditional expected value 

Pr{t\o malignant cells formed) 

= £}exp(-p,77)] 

= {[1 “ exp('-ii,i)]exp('j4.;71 + expl-iiii)}' 

== [expl-)i,r) - exp(-n,i.)exp(~)i;r) 4- expt-p,/.)]■■■ 

(This may be confirmed either by noting that the ex¬ 
pected value term is essentially the moment generating 
function of the binomial distribution of /, or else bv rec¬ 
ognizing that the expression may be interpreted as the 
probability that, for each of the .V cells, either no initi¬ 
ation occurs or initiation occurs but progression does 
not.) The probability that at least one malignant cell is 
formed is therefore 

P = 1 - [cxp(-p_,n - exp(-4|/i)exp(-p,r) -t- 
exp(-p,£)]' 

The corresponding cumulative tumor hazard is 
H = ~In([-P) ^ 

~.V* ln[exp(-p.n - exp(-p.,£)exp(-p-n 

— e\p(-p Z.il 

When expressed as a function of dose, .v, assuming that 

p, (V, - h,x ^ for I = I, T with a, > 0 and b, > 0 

this function may be concave teg., choose u,£ = D.Ol 
and p.7" = 0.0) -- 0.l.t) or convex (choose p.i = u,r 


= 0.0 l.r). Thus, solving the model exactly instead of 
approximately (via the rare-tumorapproximation) shows 
that the dose-response relation for cumulative njmor 
hazard vs, dose may be concave, contrary to the predic¬ 
tions of the approximate model. 
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